Towards a MT Evaluation Methodology
Author
Abstract
This paper is based on extensive studies of MT evaluation methods and on practical experience with comparative evaluation testing of major MT systems such as Systran, Metal, Logos, and Ariane. The evaluation studies were carried out within the framework of a Ph.D., on behalf of the EC Commission, and for the London-based computer consultancy OVUM. The paper begins with some general observations on the state of the art in NLP and MT evaluation, followed by a presentation of evaluation methodologies for comparing the quality of the linguistic performance either of successive versions of one machine translation system (vertical evaluation) or of several translation systems (horizontal evaluation). The merits and shortcomings of a rudimentary corpus-based vertical evaluation methodology are briefly discussed. The horizontal evaluation method introduced combines test suites with text samples; translation quality is measured quantitatively by counting error frequencies. Some recommendations are also made regarding comparative linguistic performance testing of translation workbenches, particularly with regard to linguistic versus statistical fuzzy-matching efficiency.
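The quantitative measurement by error-frequency counting described above can be sketched as follows. This is a minimal illustration only: the error categories, function names, and ranking rule are hypothetical assumptions, not the paper's own taxonomy or procedure.

```python
from collections import Counter

# Illustrative error taxonomy (an assumption; the abstract does not fix one).
ERROR_CATEGORIES = ("lexical", "syntactic", "morphological", "untranslated")

def error_frequencies(annotations):
    """Count annotated errors per category for one system's output.

    `annotations` is a list of (sentence_id, category) pairs, as a human
    evaluator might produce while working through a test suite or text sample.
    """
    counts = Counter(category for _, category in annotations)
    return {category: counts.get(category, 0) for category in ERROR_CATEGORIES}

def compare_systems(per_system_annotations):
    """Horizontal evaluation sketch: rank systems by total error count,
    fewest errors first."""
    totals = {name: sum(error_frequencies(anns).values())
              for name, anns in per_system_annotations.items()}
    return sorted(totals, key=totals.get)
```

For example, a system with one annotated error would rank ahead of one with three; the same per-category counts could equally drive a vertical comparison of successive versions of a single system.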
Similar Resources
Evaluation Methodology and Results for English-to-Arabic MT
This paper describes the evaluation campaign of the MEDAR project for English-to-Arabic (EnAr) MT systems. The campaign aimed at establishing some basic facts about the state of the art for MT on EnAr, collecting enough data to better train and tune systems and assessing the improvements made. The paper details the data used and their formats, the evaluation methodology and the results obtained...
Adaptation of the DARPA Machine Translation Evaluation Paradigm to End-to-End Systems
The Defense Advanced Research Projects Agency (DARPA) Machine Translation (MT) Initiative spanned four years. One outcome of this effort was a methodology for evaluating the core technology of MT systems which differ widely in approach, maturity, platform, and language combination. This black box methodology, which proved capable of measuring performance of such diverse systems, used methods wh...
A Task-Oriented Evaluation Metric for Machine Translation
Evaluation remains an open and fundamental issue for machine translation (MT). The inherent subjectivity of any judgment about the quality of translation, whether human or machine, and the diversity of end uses and users of translated material, contribute to the difficulty of establishing relevant and efficient evaluation methods. The US Federal Intelligent Document Understanding Laboratory (FI...
Evaluation Metrics for Knowledge-Based Machine Translation
A methodology is presented for component-based machine translation (MT) evaluation through causal error analysis to complement existing global evaluation methods. This methodology is particularly appropriate for knowledge-based machine translation (KBMT) systems. After a discussion of MT evaluation criteria and the particular evaluation metrics proposed for KBMT, we apply this methodology to a ...
Machine Translation on the Medical Domain: The Role of BLEU/NIST and METEOR in a Controlled Vocabulary Setting
The main objective of our project is to extract clinical information from thoracic radiology reports in Portuguese using Machine Translation (MT) and cross language information retrieval techniques. To accomplish this task we need to evaluate the involved machine translation system. Since human MT evaluation is costly and time consuming we opted to use automated methods. We propose an evaluatio...
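The automated metrics this abstract mentions (BLEU in particular) are built on clipped n-gram precision against a reference translation. A minimal single-reference sketch, assuming no smoothing and a simplified brevity penalty, neither of which matches any specific toolkit's implementation:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def modified_precision(candidate, reference, n):
    """Clipped n-gram precision: each candidate n-gram counts at most
    as often as it appears in the reference."""
    cand, ref = Counter(ngrams(candidate, n)), Counter(ngrams(reference, n))
    clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
    total = sum(cand.values())
    return clipped / total if total else 0.0

def bleu(candidate, reference, max_n=4):
    """Simplified single-reference BLEU: geometric mean of clipped
    precisions for n = 1..max_n, times a brevity penalty."""
    precisions = [modified_precision(candidate, reference, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0  # no smoothing: any zero precision zeroes the score
    bp = (1.0 if len(candidate) > len(reference)
          else math.exp(1 - len(reference) / max(len(candidate), 1)))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)
```

A candidate identical to its reference scores 1.0, and scores fall as n-gram overlap drops, which is why such metrics are attractive when human MT evaluation is too costly, as the abstract notes.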